Many smart parking lots now use plate detection systems to control incoming and
outgoing vehicles. However, these systems work only in a fixed environment,
require manual labor, and need checkpoints at the entrances. To solve these
problems, we propose a novel algorithm for wide-angle car number plate detection
that combines warped planar object detection (WPOD-NET) with a modified support
vector machine (SVM). Compared with other models, the proposal improves not only
the range of detection angles but also the detection accuracy in shady
conditions. The results show that the accuracy of the proposed model reaches
95.1% on 1000 test images in various scenarios.
1. INTRODUCTION

Automatic transportation detecting systems (ATDS) that provide vehicle registration data have developed
rapidly. Each car has an identification number that must be registered with the law enforcement office
before use. Therefore, license plate recognition (LPR) plays an important role in vehicle management
systems. Deep learning has improved the results of computer vision tasks such as object recognition and
optical character recognition (OCR). LPR systems have several traffic and security applications, such as
toll fee collection, stolen vehicle detection, smart houses, and parking management. LPR involves capturing
images with a digital camera, pre-processing them, and adjusting them for the prediction model. The output
is the text of the car plate that appears in the original image. In addition, the system can be adapted to
user requirements, for instance to manage “smart” parking.

However, most LPR systems installed in Vietnam are designed to recognize plates from a fixed view,
mostly the front of the vehicle, in a controlled environment such as an indoor scene with limited external
effects. A system that overcomes these limits is therefore needed, especially in megacities such as Hanoi
or Ho Chi Minh, where the government will apply computer science to manage transportation infrastructure.
In this paper, the proposed system, which uses warped planar object detection (WPOD-NET) and a modified
support vector machine (SVM), is evaluated over various scenarios. As a result, the system is able to
detect Vietnamese car plate numbers at different angles.

The rest of the paper is organized as follows. Section 2 presents several related works. Section 3
presents the proposed model. Section 4 evaluates the proposed model and analyzes the results. In the final
section, we give conclusions and future research directions.
2. RELATED WORK

Many algorithms detect car number plates using convolutional neural networks (CNN) [1]-[13]. A machine
learning model based on WPOD-NET has been proposed in [1]. In that model, the authors detect the license
plate regions of an image, regress one affine transformation per detection, and rectify the regions to a
frontal view. The results are appropriate inputs for the second stage of the detection process. The CNN is
trained to detect objects (license plates) under different distortions and to reshape them into the
rectangle that would be seen directly from the front of the vehicle. WPOD-NET was built using insights from
the single shot MultiBox detector (SSD) [2], you only look once (YOLO) [3], and spatial transformer
networks (STN) [4]. SSD and YOLO detect and recognize multiple objects at the same time. STN, on the other
hand, can find non-rectangular areas but cannot handle multiple objects at once. The processing of
WPOD-NET is described as follows.

First, the resized image is fed into the network, which extracts the features of the smallest region
likely to contain the car plate. When the input image is fed into WPOD-NET, the ratio between the license
plate size and the car bounding box is much higher for a frontal view than for an oblique one, which may
lead to numerical instabilities in the output. The working principle of the network is to use a matrix to
transform an imaginary square into the rectangular license plate region. As a result, the CNN in the next
stage works on a rectified region and gives positive results.
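
The matrix step described above can be illustrated with a minimal sketch. It assumes the parameterization reported in [1], where six regressed values form a 2x2 matrix plus a translation and the diagonal terms are clamped at zero to avoid mirrored plates. The function name `warp_unit_square` and the sample values are hypothetical and serve only as an illustration.

```python
def warp_unit_square(v3, v4, v5, v6, v7, v8):
    """Map the corners of a canonical unit square with an affine transform.

    Hypothetical helper: the six values stand in for the affine coefficients
    regressed by the network; the diagonal terms are clamped at zero,
    following the description in [1].
    """
    a, b = max(v3, 0.0), v4
    c, d = v5, max(v6, 0.0)
    # Corners of the imaginary square, centred at the origin.
    square = [(-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5)]
    return [(a * x + b * y + v7, c * x + d * y + v8) for x, y in square]


# Assumed example values: stretch horizontally, shear slightly, then shift.
corners = warp_unit_square(2.0, 0.2, 0.0, 1.0, 5.0, 3.0)
```

The four returned points outline the plate region in image coordinates; the inverse mapping would bring the plate back to a frontal rectangle.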

In the proposed model, a CNN-based WPOD-NET is used to obtain the highest output accuracy. In [5], [6],
YOLO networks were used to find the region that contains the license plate. The advantage of the YOLO
network is that its detection time (less than 0.3 second) and accuracy are slightly better than those of
WPOD-NET. However, the drawback is that those experiments only consider on-plane rotation, so all oblique
views were left untested. Although YOLO networks work well at finding the whole object in good conditions,
such as frontal views without skew, their results on test images with different oblique angles are not
high.

Duan et al. [7] and Smith [8] use hidden Markov models for OCR, with an accuracy of 97.82% in a
specific environment and 97.19% in random ones. In these papers, the authors capture images of the object
from three fixed angles (frontal, left, and right views) within a range of 30 degrees. However, the model
cannot detect plates in bad conditions, for example when the shadow of another object falls on the plate
and creates uneven color regions. Our model, in contrast, is able to solve the problems of [7], [8], as
shown in Table 1.

In [9]-[11], an OCR model was used for the experiments. The objects of these papers are Greek car
plates, whose set of characters and digits is similar to that of the experiments mentioned above. A
two-layer probabilistic neural network (PNN) gave a recognition result of 86% in the corresponding
scenarios of Li et al. [11], without considering skewed or distorted objects. A detection failure occurred
in Roy et al. [9] when objects were converted to digital images together. The drawback in [9] is similar to
that of [8], [12], where the samples are images taken from the frontal view. The result is not good under
poor conditions. Moreover, the Greek plate is rectangular, which differs from the Vietnamese car
registration plate, and all of the samples are taken from the front. In Chen [13], the accuracy of the
system is only 78% because the authors do not apply any filter to reduce noise before the recognition step.

Berchmans and Kumar [14] used a self-organizing (SO) recognition procedure in their experiment. The
procedure consists of three steps, namely character categorization, topological sorting, and recognition.
The combined result was 93.7% for frontal views with a fixed viewpoint and complex scenarios. A salient
drawback of that paper is misdetection when the plate has the same color as the vehicle. In addition, the
character and digit recognition step takes approximately 2 seconds. The same situation occurs with the
algorithm in [15], where the system takes more than 2 seconds for processing. Xie et al. [5] use two YOLO
models to detect and recognize the characters and digits. However, their samples are on-plane rotation
scenes, and oblique examples are missing from their experiment.

To solve these problems, we choose WPOD-NET to recognize the plate, based on [5], [8], [9], [11]. The
authors of [1] use the same network, but their dataset does not contain square Vietnamese plates. The
proposed model not only detects well in poor conditions but also performs adequately in real environments
in Vietnam.


Table 1. Comparison of effective algorithms

No.  Comparison object    Advance of algorithm
1    WPOD-NET [9], [11]   Accuracy is 86% in most challenging scenarios containing oblique license plates
2    SO [14], [15]        Accuracy [14] is 93.7% while processing time [15] is only 2 seconds
3    Algorithm [13]       78% of plate recognition without using noise reduction process
4    Yolo [5]             Using two Yolo models for detecting and recognizing characters and digits


Int J Artif Intell, Vol. 10, No. 3, September 2021: 657 - 665 


Int J Artif Intell ISSN: 2252-8938


3. PROPOSAL SYSTEM 
3.1. Overview 

Car registration number detection is a small branch of object detection. It employs automated methods
to verify or recognize the presence of a car license plate based on the characteristics of its region and
shape. By detecting a rectangular region that contains a group of digits and characters, the system is able
to find the object of interest (the car number plate). The system uses a pre-trained model and provides
information for two main sectors, namely law enforcement and commercial applications. Car plate detection
is used to control transportation infrastructure and to reduce the damage that congestion causes to the
national economy. A detection system plays an important role in measuring the daily routes of vehicles,
which helps to find solutions for traffic management. In big cities (Hanoi or Ho Chi Minh), police
departments have to maintain traffic safety and order. Street cameras are set up to supervise moving
vehicles and report them. The algorithm for detecting car registration numbers consists of three major
phases, namely bounding the car plate, detecting the characters and digits, and recognizing them, as shown
in Figure 1. Algorithms containing all listed phases are considered fully automatic systems and output the
license plate together with the text of its digits and characters, as shown in Figure 2. Figure 2(a) gives
the result of detecting a license plate from the front view. Figure 2(b) is the identification result for
different angles of the license plate. The results show that several number plates are unrecognizable.


[Diagram: Bounding box (car number) -> Detecting number and characters -> Recognizing characters and numbers]


Figure 1. Diagram of car registration numbers detection process 


Figure 2. Result of detecting license plate with digits and characters: (a) an example of detecting car plate, 
(b) different aspect of a plate 


3.2. Proposal system 

The proposed method includes three steps, as shown in Figure 3. Figure 3(a) shows the three steps:
detecting the license plate, extracting the characters and digits of the plate, and recognizing them.
Figure 3(b) shows the steps in more detail. When an input image is received, the first module (WPOD-NET)
finds the area with the highest confidence of being the car plate. Since the red, green, blue (RGB) input
image of WPOD-NET is set to a dimension from 288 to 608 pixels, images of other sizes are rescaled
according to this configuration. After rescaling, the module warps the detected object to a frontal view as
in [1], using the matrix T to adjust the angle of the characters and digits without losing their features.
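
The rescaling rule is not spelled out in the text, so the following is only a sketch under stated assumptions: the longer image side is clamped to the range 288 to 608 pixels, the aspect ratio is kept, and both sides are rounded down to a multiple of 16 (the down-sampling factor of the network). The function name `wpod_input_size` and its parameters are hypothetical.

```python
def wpod_input_size(height, width, min_dim=288, max_dim=608, stride=16):
    """Return (new_height, new_width) for a rescaled network input.

    Assumed rule, not taken from the paper: clamp the longer side to
    [min_dim, max_dim], keep the aspect ratio, and round both sides down
    to a multiple of `stride`.
    """
    longer = max(height, width)
    target = min(max(longer, min_dim), max_dim)
    scale = target / longer
    new_h = max(stride, round(height * scale) // stride * stride)
    new_w = max(stride, round(width * scale) // stride * stride)
    return new_h, new_w


# A 720 x 1280 camera frame is shrunk so that its longer side is 608 pixels.
size = wpod_input_size(720, 1280)
```

Rounding to a multiple of the stride keeps the pooled feature map aligned with the input grid.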

In the second step, several obstacles appear on the plate after it is converted to a grayscale image.
Their values are close to those of the numbers and characters on the plate, which makes them difficult to
process (a problem missed in the aforementioned papers). The proposed solution is to use a Gaussian mask
with sigma equal to 10 combined with filters and a threshold. Areas are set to 1 (white) or 0 (black)
depending on whether or not they surpass the threshold. Inside the bounding box (predicted box), OpenCV and
other libraries normalize the object and eliminate noise and obstacles that would create problems for the
final step. The positive outputs of the second step are rescaled to the size 30x60 and passed to the SVM
model for prediction.
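
How the Gaussian mask and the threshold are combined is not detailed in the text. One plausible reading, sketched below as an assumption, is a local threshold in which each pixel is compared with the Gaussian-weighted mean of its neighbourhood, which tolerates uneven brightness. The sketch uses only the standard library so that it runs anywhere; in practice OpenCV or NumPy would do the same work. All function names and the `offset` parameter are hypothetical.

```python
import math


def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel truncated at 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_1d(values, kernel):
    """Smooth one row or column, repeating the border pixel at the edges."""
    radius = len(kernel) // 2
    last = len(values) - 1
    return [sum(k * values[min(max(i + j - radius, 0), last)]
                for j, k in enumerate(kernel))
            for i in range(len(values))]


def gaussian_blur(image, sigma):
    """Separable Gaussian blur of a 2-D list of grayscale values."""
    kernel = gaussian_kernel(sigma)
    rows = [blur_1d(row, kernel) for row in image]
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def binarize(image, sigma=10, offset=1.0):
    """Return 1 (white) where a pixel exceeds its local Gaussian mean minus
    `offset`, and 0 (black) elsewhere (assumed thresholding rule)."""
    background = gaussian_blur(image, sigma)
    return [[1 if px > bg - offset else 0 for px, bg in zip(row, bg_row)]
            for row, bg_row in zip(image, background)]
```

Because the threshold follows the local background, a dark stroke stays black even when a shadow darkens one side of the plate.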

The difference between the proposal and [1] is that the SVM model is first used as the recognition
model, as shown in the first path in Figure 3. The purpose of changing the recognition model is to improve
the accuracy and processing time of the system. Normally, a detection system is built on neural networks
(NN). In contrast, the experiment uses a hybrid system combining two heterogeneous models for the “smart”
parking application, for the following reasons:

— An SVM requires a smaller data source than an NN.

— Parameter optimization is simple (an SVM requires just 2-3 parameters).

— An SVM is more interpretable than an NN.

— A commercial product needs a low price without loss of accuracy, so an SVM is the best choice.

Figure 3. Three processing steps of proposal system: (a) detecting license plate, extracting characters, digits
of plate, and recognizing steps, (b) detailed steps to implement blocks


3.2.1. Recognizing car plate 

Detecting the license plate is the first phase; it plays an important role for the following steps and
directly affects the output. Most automated systems used to detect car registration numbers operate in a
fixed environment. In addition, the diversity of plate shapes and mounting positions makes it challenging
for the system to reach the optimal output, as shown in Figures 4 and 5. Manual work is still needed to
collect the required information from the beginning, instead of being totally replaced by the computer.

In this paper, we choose the WPOD-NET model to recognize the plate. WPOD-NET consists of 21
convolutional layers, 14 of which are inside residual blocks. All internal filters have size 3x3, and ReLU
activations are used throughout the network except in the detection block. Four max pooling layers (size
2x2) with stride 2 reduce the input dimensions by a factor of 16. The final block has two parallel layers:
one infers the probability using the softmax function, and the other uses a linear function. More details
can be found in [1].
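
The factor of 16 follows from the four pooling layers, since each one halves both spatial dimensions (2^4 = 16). The short sketch below works this out for a given input size. The 8 output channels (2 softmax probabilities plus 6 affine coefficients) are taken from the description in [1] and are an assumption here; the function name `wpod_output_shape` is hypothetical.

```python
def wpod_output_shape(height, width, num_pool=4, channels=8):
    """Shape of the detection map for an input of the given size.

    Each 2x2 max pooling with stride 2 halves both spatial dimensions, so
    `num_pool` of them divide the input by 2 ** num_pool. The channel count
    is an assumption based on [1].
    """
    factor = 2 ** num_pool
    return height // factor, width // factor, channels


# A 288 x 608 input yields an 18 x 38 grid of candidate plate locations.
shape = wpod_output_shape(288, 608)
```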


3.2.2. Detecting characters and digits 

In the second phase, the automated system uses an algorithm to detect the characters and digits inside
the object found in the previous phase. Python and libraries such as TensorFlow, the open source computer
vision library (OpenCV), numerical Python (NumPy), or PIL are used to process the image before detection
starts, so that the regions with the highest probability of being a digit or character are found. The
results are shown in Figure 6. Figure 6(a) is an example of detecting the characters and digits of a
license plate using TensorFlow. Figure 6(b) shows input images with different viewing angles. They are used
for identification and classification in the next steps.
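
The text does not state how candidate character regions are located. As an illustration only, the sketch below assumes a binarized plate and finds character candidates with a 4-connected flood fill, discarding blobs smaller than a hypothetical `min_area`. It is not the authors' TensorFlow-based procedure.

```python
from collections import deque


def find_character_boxes(binary, min_area=2):
    """Return bounding boxes (x_min, y_min, x_max, y_max) of foreground blobs.

    `binary` is a 2-D list where 1 marks character pixels. Blobs are found
    with a 4-connected flood fill and returned sorted from left to right.
    """
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            area = 0
            while queue:
                cx, cy = queue.popleft()
                area += 1
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                               (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if area >= min_area:  # drop isolated noise pixels
                boxes.append((x0, y0, x1, y1))
    return sorted(boxes)
```

For two-row square plates, the boxes would additionally need to be grouped by row before reading them from left to right.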


[Figure 4 layout: columns No., Region of interest, Binary image, Detection result; rows Without filter, With filter]
Figure 4. Result of region of interest in good and bad cases

[Figure 5 panels: Frontal view, 0-30 degree, 30-60 degree]

Figure 5. Detecting result of different angles 


Figure 6. Result of detecting license plate in the second phase, (a) process of detecting characters and digits, 
(b) different cases of test sample 
3.2.3. Recognizing characters and digits

There are many methods to recognize characters and digits, and the most widely used are SVM [16], [17]
and Tesseract optical character recognition (OCR) [18], [19].

— Tesseract OCR engine: This is one of the leading OCR engines. It is distributed as open source under
the Apache 2.0 license, recognizes characters in images, and exports them as plain text, HTML, PDF, or TSV.
Its functions can be used through an API. Tesseract OCR is an open-source project started by
Hewlett-Packard. The stable version 4.0.0, released in 2018, is based on long short-term memory (LSTM).
LSTM is a well-known form of recurrent neural network (RNN) and is used to handle text of arbitrary length.
Furthermore, Tesseract supports many image formats, and support for a large number of languages has
gradually been added.

— SVM: This model analyzes data for classification and regression. SVMs are considered to have the
highest classification accuracy among binary classifiers [20]-[22]. It is a learning technique regarded as
an effective general-purpose method because of its high performance without additional prior knowledge.
Initially, the SVM finds the hyperplanes (decision boundaries) that classify the data. It separates the
largest possible fraction of points of the same class onto one side while maximizing the distance from
either class to the hyperplane. This hyperplane is called the optimal separating hyperplane (OSH); it
minimizes the risk of misclassifying not only the examples in the training dataset but also unseen
examples. The SVM model has several advantages: i) it is a very good algorithm for unknown data, ii) it is
appropriate for specific tasks such as text classification, iii) it scales well to high-dimensional data.
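
The idea of the optimal separating hyperplane can be made concrete with a small numeric sketch. It measures the margin, that is, the smallest distance from any training point to a candidate hyperplane w.x + b = 0, for two candidates on invented toy data. The helper names and the data are hypothetical and only illustrate the geometric definition.

```python
import math


def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def margin(w, b, points, labels):
    """Smallest label-corrected distance; positive if all points are separated."""
    return min(y * signed_distance(w, b, x) for x, y in zip(points, labels))


# Toy data: two classes on either side of the vertical axis.
points = [(2, 0), (3, 1), (-2, 0), (-3, -1)]
labels = [1, 1, -1, -1]
vertical = margin((1, 0), 0, points, labels)   # hyperplane x = 0
diagonal = margin((1, 1), 0, points, labels)   # hyperplane x + y = 0
```

Both candidates separate the classes, but the vertical one leaves the larger margin, so it is the one a maximum-margin classifier would prefer on this data.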

Because each region of the world uses a different font for the characters and digits on the plate, we
used a separate dataset for Vietnamese car plates. The recognition module is a support vector machine
model. The primary reason for choosing an SVM model is that it requires a smaller data source than a neural
network. Besides that, an SVM model requires only three parameters to set up. SVMs are popular in text
classification tasks, where high-dimensional spaces are the norm. In this paper, one type of SVM, the
C-SVM, is used for the OCR module. The 36 groups of characters and digits in binary format are separated by
hyperplanes, with the penalty multiplier C for outliers equal to 1.
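
To make the input of the classifier concrete, the sketch below builds the 36 class labels and flattens one binary glyph into a feature vector. It assumes that the 36 groups are the digits 0-9 and the letters A-Z, and that the 30x60 glyph is 30 pixels wide and 60 pixels high; neither detail is stated in the text, and the names are hypothetical.

```python
import string

# Assumed label set: 10 digits followed by 26 uppercase letters (36 classes).
CLASSES = list(string.digits + string.ascii_uppercase)


def to_feature_vector(glyph, rows=60, cols=30):
    """Flatten a binary glyph (list of rows of 0/1) into one feature vector."""
    if len(glyph) != rows or any(len(row) != cols for row in glyph):
        raise ValueError("glyph must be %d x %d" % (rows, cols))
    return [px for row in glyph for px in row]


# A blank glyph gives a vector of 30 * 60 = 1800 features.
blank = [[0] * 30 for _ in range(60)]
features = to_feature_vector(blank)
```

Each of the 1800 entries is one dimension of the space in which the separating hyperplanes are fitted.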

Poor image quality is an obstacle that affects the detection results. Once the features are extracted,
the discrimination functions between each pair of classes are learned by SVMs. Therefore, a binary tree
structure is proposed in this paper to recognize the test samples. For detecting characters and digits, a
multi-class SVM is used to assign labels to instances. This approach turns the problem into multiple binary
classifications. The common method is to distinguish one class from all the others. Following [17], the
class whose classifier has the highest output function is selected.
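
The one-against-all decision rule described above can be sketched as follows, assuming each class has a linear binary classifier given by a weight vector and a bias. The weights in the example are invented for illustration and are not trained values; the function names are hypothetical.

```python
def decision_value(w, b, x):
    """Output function w.x + b of one binary classifier."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict_one_vs_rest(classifiers, x):
    """Return the label whose binary classifier gives the highest output."""
    return max(classifiers,
               key=lambda label: decision_value(*classifiers[label], x))


# Invented two-feature classifiers for three labels, stored as (weights, bias).
classifiers = {
    "B": ((1.0, -1.0), 0.0),
    "8": ((-1.0, 1.0), 0.0),
    "3": ((0.0, 0.0), -0.5),
}
label = predict_one_vs_rest(classifiers, (2.0, 0.5))
```

With 36 classes the same rule applies unchanged: 36 output values are computed and the largest one decides the label.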


4. SIMULATION AND RESULTS 
4.1. Setup 

WPOD-NET combined with the SVM algorithm is used to detect car registration numbers in this paper. The
test data consists of 1000 images of vehicles in real scenarios. In our experiment, each object is captured
at three angles of the license plate: from the left, from the right, and from the front. There are four
cases of bad conditions, namely evening, the shadow of a tree, lack of brightness, and faded plate numbers.
Several images are slightly blurred or distorted. The distance from the camera to the plate varies and is
reflected in the size of the plate relative to the original image. Most images are in good condition with a
clear view. We divided the license plates into two main groups: one of random objects and one in which each
car is captured at different angles. The algorithm runs on an LG Gram with an Intel® Core™ i5-7200 CPU @
2.50GHz 2.71GHz, 7.86 GB RAM, and 64-bit Windows 10. The ratio between the training and testing datasets is
4:1.
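
A 4:1 split can be sketched as below. The paper does not state the total dataset size or how the split was drawn; the example assumes that the 1000 test images are the one-fifth share of a 5000-image set and that images are shuffled with a fixed seed. The function name and parameters are hypothetical.

```python
import random


def split_dataset(items, train_parts=4, test_parts=1, seed=0):
    """Shuffle the items and split them into training and testing lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:cut], shuffled[cut:]


# Assumed total of 5000 image identifiers: 4000 for training, 1000 for testing.
train_ids, test_ids = split_dataset(range(5000))
```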

In a few cases, the system makes a mistake in determining whether a plate is square or rectangular,
and the output is therefore incorrect. These shape misclassifications are caused by the different angles of
the objects.


4.2. Results 

Several salient instances are shown in Table 2. These images of vehicles were captured by ourselves in
challenging scenarios. We also use the filter to check the image quality when applying the algorithm.
Without the filter, the algorithm fails to identify the characters and digits on the plate, as shown in
Figure 4. With the filter, on the other hand, the characters and digits are fully identified in several
cases.

Tables 2 and 3 include several cases that contain plates in complex scenarios. In the first three
cases, the Gaussian mask is applied to improve the output of the detection step, and all characters and
digits are recognized. No. 4 and 5 do not apply the noise filter, and thus the result of the recognition
step is not good. No. 6 is a case in which the image is affected by streetlight. As a result, no. 4 is
missed in the prediction step. No. 7 is an example of our system running in a shady condition with uneven
brightness. With the filter, the output is 100% correct.

The goal of our experiment is to correctly recognize the whole string of characters and numbers on the
plate. The final result contains 951 images in which the full sequence of numbers and characters is
correct. The other 49 results are incorrect, with mistaken numbers and characters or a missed object due to
environmental conditions. As shown in Tables 2 and 3, most of the incorrect results (42 cases) occur
because the recognition system has limitations in detecting characters and digits on plates that are
transformed from oblique views. In addition, seven objects captured from frontal views have incorrect
outputs because of common errors such as confusing B with 8 or because of the brightness of the
environment. Figures 5 and 7 illustrate intuitive smart parking scenarios. The processing time to detect a
plate is from 0.7 to 1.2 seconds (depending on changes in the environment and the image quality). However,
the result of the proposal is not good in several cases, depending on the weather condition, blurring, or
the shadow of an object covering the plate. We will therefore improve the system to solve these problems
and combine it with other advanced networks [23]-[26] to improve the accuracy.
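
The counts reported above can be cross-checked with simple arithmetic. The sketch below uses only the numbers given in the text (1000 test images, 42 errors from oblique views, 7 errors from frontal views, 500 frontal images); the function and variable names are hypothetical.

```python
def accuracy_percent(correct, total):
    """Share of correct results in percent, rounded to one decimal place."""
    return round(100.0 * correct / total, 1)


total_images = 1000
oblique_errors = 42
frontal_errors = 7

incorrect = oblique_errors + frontal_errors          # 49 incorrect results
correct = total_images - incorrect                   # 951 correct results
overall = accuracy_percent(correct, total_images)    # overall ratio
frontal = accuracy_percent(500 - frontal_errors, 500)  # frontal-view ratio
```

The overall value matches the 95.1% reported for the whole test set, and the frontal value matches the 98.6% listed for frontal views.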


Table 2. Result of proposal algorithm

                                Different scenarios                 Number of correct detections
                    Number of   Distance (m)    Range of angle      Distance (m)    Range of angle      Ratio of correct
Type of object      objects     0-1     1-2     0-30    30-60       0-1     1-2     0-30    30-60       detection (%)
1. Frontal view     500         290     210     x       x           288     205     x       x           98.6
2. Oblique views    500         240     260     300     200         210     248     284     174         91.2
2.1. Left View      200         100     100     125     75          79      96      113     62          86.5
2.2. Right View     300         140     160     175     125         131     152     171     112         94.3
TOTAL               1000        1000                                951                                 95.1

Quality of object: All objects are in original shape, without physical transformation


Table 3. Results of complex cases 
Region contains plate Binary image Characters and digits detected 
Figure 7. The samples contain different distance and angle


Proposing WPOD-NET combining SVM system for detecting car number plate (Phat Nguyen Huu) 




5. CONCLUSION 

This paper proposes a car number detection algorithm for parking systems. The system still needs to
improve the step that raises the quality of the images of detected objects. In real applications, the
environment plays an important role in detection. The factors include a low-quality camera, bad weather
conditions, blurring, or the shadow of an object covering the plate. All of these problems decrease the
performance of the plate detection system. The shadow creates uneven brightness on the plate, which
prevents the system from normalizing the images for further steps. Therefore, we will create an adaptive
wavelet filter to optimize the pre-processing module and combine it with other networks to improve the
accuracy of the proposed system.